An Evaluation of SPARQL Federation Engines Over Multiple Endpoints
Abstract
Due to the decentralized and linked architecture underlying Linked Data, running complex queries often requires collecting data from multiple RDF datasets. Optimizing the runtime of such queries, called federated queries, is of central importance for ensuring the scalability of Semantic-Web and Linked-Data-driven applications. This has motivated a considerable body of work on SPARQL query federation. However, previous evaluations of SPARQL query federation engines do not measure how these engines perform at the individual steps of federated query processing. Consequently, it is difficult to pinpoint the components of the federation engines that need to be improved. This work presents an extended summary of the fine-grained evaluation of SPARQL endpoint federation systems performed in [13]. Besides query runtime, we extend the scope of our performance evaluation to additional measures that are important but have received little attention in previous studies. Our experimental outcomes lead to novel insights for improving current and future SPARQL federation systems.
Similar resources
Tracking Federated Queries in the Linked Data
Federated query engines allow data consumers to execute queries over the federation of Linked Data (LD). However, because federated queries are decomposed into potentially thousands of subqueries distributed among SPARQL endpoints, data providers do not know the federated queries; they only know the subqueries they process. Consequently, unlike warehousing approaches, LD data providers have no access to sec...
Sharing Statistics for SPARQL Federation Optimization, with Emphasis on Benchmark Quality
Federation of semantic data on SPARQL endpoints will allow data to remain distributed so that it can be controlled by local curators and swiftly updated. There are considerable performance problems, which the present work proposes to address, mainly by computation and exposure of statistical digests to assist selectivity estimation. For an objective evaluation as well as comparison of engines, ...
BioFed: federated query processing over life sciences linked open data
BACKGROUND Biomedical data, e.g. from knowledge bases and ontologies, is increasingly made available following open linked data principles, at best as RDF triple data. This is a necessary step towards unified access to biological data sets, but this still requires solutions to query multiple endpoints for their heterogeneous data to eventually retrieve all the meaningful information. Suggested ...
Querying over Federated SPARQL Endpoints - A State of the Art Survey
The increasing amount of Linked Data and its inherent distributed nature have attracted significant attention throughout the research community and amongst practitioners to search data, in the past years. Inspired by research results from traditional distributed databases, different approaches for managing federation over SPARQL Endpoints have been introduced. SPARQL is the standardised query l...
How Interlinks Influence Federated over SPARQL Endpoints
As the Web of Data grows, the number of available SPARQL endpoints increases. SPARQL endpoints conceptually represent RPC-style, coarse-grained data access mechanisms. Nevertheless, through the potential interlinking of the contained entities, SPARQL endpoints should be able to offer distinct advantages over plain Web APIs. To our knowledge, to date, there has been no study conducted that gauges...